Heuristic-Based Scheduling to Maximize Throughput of Data-Intensive Grid Applications

Authors

  • Souvik Ray
  • Zhao Zhang
Abstract

Job scheduling in data grids must consider not only the computation load at each grid node but also the distribution of the data required by each job. Furthermore, recent trends in grid applications emphasize high throughput more than high performance. In this paper, we propose a centralized scheduling scheme that uses a scheduling heuristic called Maximum Residual Resource (MRR), which targets high throughput for data grid applications. MRR takes into account both the computation time and the data resources available at a site when evaluating the site as a candidate for a given job. It tries to minimize the number of job executions that require remote data transfer while maintaining good load balance among grid sites. We have analyzed the performance potential of MRR and have developed a simulator to evaluate it with typical grid configurations. Our results show that MRR brings significant performance improvements over existing online and batch heuristics such as MCT, Min-min, and Max-min.
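The abstract does not specify how MRR scores a site, so the sketch below does not attempt to reproduce it. It illustrates only the baseline heuristics named above (MCT, Min-min, Max-min) in their common textbook form. The function names `mct` and `batch_schedule`, and the `etc` matrix (`etc[j][s]` is the estimated run time of job `j` on site `s`), are assumptions made for illustration; data transfer costs are ignored.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]  # etc[j][s]: estimated run time of job j on site s (assumed input)


def mct(etc: Matrix) -> Tuple[List[int], float]:
    """Online MCT: place each job, in arrival order, on the site
    that gives it the minimum completion time."""
    n_sites = len(etc[0])
    ready = [0.0] * n_sites  # time at which each site becomes free
    assignment = []
    for row in etc:
        site = min(range(n_sites), key=lambda s: ready[s] + row[s])
        ready[site] += row[site]
        assignment.append(site)
    return assignment, max(ready)  # (site per job, makespan)


def batch_schedule(etc: Matrix, pick_max: bool = False) -> Tuple[Dict[int, int], float]:
    """Batch heuristic: Min-min when pick_max is False, Max-min when True.

    Each round computes every unscheduled job's best completion time,
    then schedules the job whose best time is smallest (Min-min)
    or largest (Max-min)."""
    n_sites = len(etc[0])
    ready = [0.0] * n_sites
    unscheduled = set(range(len(etc)))
    assignment: Dict[int, int] = {}
    choose = max if pick_max else min
    while unscheduled:
        best = {}  # job -> (best completion time, site achieving it)
        for j in unscheduled:
            s = min(range(n_sites), key=lambda s: ready[s] + etc[j][s])
            best[j] = (ready[s] + etc[j][s], s)
        job = choose(sorted(unscheduled), key=lambda j: best[j][0])
        completion, site = best[job]
        ready[site] = completion
        assignment[job] = site
        unscheduled.remove(job)
    return assignment, max(ready)
```

Calling `mct(etc)` or `batch_schedule(etc, pick_max=True)` on a small jobs-by-sites matrix returns the chosen site for each job and the resulting makespan. A data-aware heuristic such as MRR would additionally account for where each job's input data resides, which these baselines do not model.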


Related resources

A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

Data Grid is an infrastructure that controls huge amount of data files, and provides intensive computational resources across geographically distributed collaboration. The heterogeneity and geographic dispersion of grid resources and applications place some complex problems such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...


Data Replication-Based Scheduling in Cloud Computing Environment

High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...


Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid Computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling is an important issue for large scale computational grid because of its unreliable nature of grid resources. Commonly exploited techniques to realize fault tolerance is periodic Checkpointing that periodically ...


A throughput maximization strategy for scheduling transaction-intensive workflows on SwinDeW-G

With the rapid development of e-business, workflow systems now have to deal with transaction-intensive workflows whose main characteristic is the huge number of concurrent workflow instances. For such workflows, it is important to maximize the overall throughput to provide good quality of service. However, most of the existing scheduling algorithms are designed for scheduling of a single comple...


Grid Resource Management and Scheduling for Data Streaming Applications

Data streaming applications bring new challenges to resource management and scheduling for grid computing. Since real-time data streaming is required as data processing is going on, integrated grid resource management becomes essential among processing, storage and networking resources. Traditional scheduling approaches may not be sufficient for such applications, since usually only one aspect ...




Publication date: 2004